Sains Malaysiana 53(11)(2024): 3607-3615
http://doi.org/10.17576/jsm-2024-5311-05
Al-Khawarizmi Heuristik bagi Pautan Data dalam Menganggarkan Bilangan Kemalangan Jalan Raya Tidak Terlapor
(A Heuristic Algorithm of Data Linkage in Estimating the Number of Unreported Traffic Accidents)
ZAMIRA HASANAH ZAMZURI* & NOR WAZIRAH RADZMAN SHAH
Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
Received: 30 April 2024/Accepted: 7 August 2024
Abstract
Traffic accident data analysis is vital for planning optimal preventive measures and minimizing the risk of accidents. Reported traffic accident counts often exhibit excess zeros, which are believed to arise from underreporting. Hence, estimating the number of unreported accidents is needed to avoid underestimation and inaccuracy in traffic accident data analysis. One way to estimate unreported accidents is to compare two data sets and take the proportion of unmatched entries as the underreporting rate. In this study, probabilistic data linkage is used to link two traffic accident data sets, sourced from police reports and hospital records, covering January to March 2011. A heuristic algorithm is developed based on the needs found during the linkage process. The heuristic elements lie in the staged data linkage process, which establishes the best set of non-unique identifiers, and in identifying suitable and rational data filters for the estimation. The unit of each data entry then needs to be converted from per individual to per accident, since the ultimate aim of this study is to estimate the number of unreported accidents. The non-unique identifiers used in the linkage are gender, age, race and vehicle type. Based on the linked data and the estimation process, around 68% of accidents are estimated to be unreported, amounting to 6366 accidents. The developed algorithm can be used to link traffic accident data from police reports and hospital records in Malaysia. A comparison of reported and unreported accidents in the hospital records also shows that most unreported accidents involve serious offences such as drug use and excessive alcohol consumption.
Keywords: Data linkage; heuristic; traffic accidents; unreported
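The abstract does not spell out the linkage procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Fellegi-Sunter style agreement weight over the four non-unique identifiers, a greedy one-to-one assignment, and a hypothetical `accident` field that groups hospital casualties by crash. The m and u probabilities, the threshold and the toy records are all illustrative assumptions rather than values from the study.

```python
from math import log2

# Assumed (m, u) probabilities per identifier, for illustration only:
# m = P(field agrees | same person), u = P(field agrees | different people).
M_U = {
    "gender": (0.95, 0.50),
    "age": (0.90, 0.05),
    "race": (0.95, 0.30),
    "vehicle": (0.90, 0.20),
}


def match_weight(rec_a, rec_b):
    """Sum of log2 agreement/disagreement weights over all identifiers."""
    weight = 0.0
    for field, (m, u) in M_U.items():
        if rec_a[field] == rec_b[field]:
            weight += log2(m / u)
        else:
            weight += log2((1 - m) / (1 - u))
    return weight


def link_records(hospital, police, threshold):
    """Greedily link each hospital record to its best unused police record."""
    links, used = {}, set()
    for h in hospital:
        best = None
        for p in police:
            if p["id"] in used:
                continue
            w = match_weight(h, p)
            if w >= threshold and (best is None or w > best[0]):
                best = (w, p["id"])
        if best is not None:
            links[h["id"]] = best[1]
            used.add(best[1])
    return links


def unreported_accidents(hospital, links):
    """Per-individual to per-accident: an accident counts as reported
    if at least one of its casualties was linked to a police record."""
    reported = {h["accident"] for h in hospital if h["id"] in links}
    everything = {h["accident"] for h in hospital}
    return sorted(everything - reported)


# Toy records (hypothetical, not data from the study).
hospital = [
    {"id": "H1", "accident": "A1", "gender": "M", "age": 25, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "H2", "accident": "A1", "gender": "F", "age": 23, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "H3", "accident": "A2", "gender": "M", "age": 40, "race": "Chinese", "vehicle": "car"},
    {"id": "H4", "accident": "A3", "gender": "M", "age": 31, "race": "Indian", "vehicle": "lorry"},
]
police = [
    {"id": "P1", "gender": "M", "age": 25, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "P2", "gender": "F", "age": 52, "race": "Chinese", "vehicle": "car"},
]

links = link_records(hospital, police, threshold=5.0)
unreported = unreported_accidents(hospital, links)
rate = len(unreported) / len({h["accident"] for h in hospital})
print(links, unreported, round(rate, 2))
```

In this toy case only H1 links to a police record, so accident A1 counts as reported even though its second casualty (H2) has no match, while A2 and A3 count as unreported. Counting per accident rather than per casualty is what keeps the unmatched H2 from inflating the estimate. The greedy assignment depends on record order; a real linkage would more likely use blocking and an optimal assignment step.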
*Corresponding author; email: zamira@ukm.edu.my